Abstract

Using self-organized polymer models, we predict mechanical unfolding and refolding pathways of
ribozymes and the green fluorescent protein. In agreement with experiments, there are between six
and eight unfolding transitions in the Tetrahymena ribozyme. Depending on the loading rate, the 
number of rips in the force-ramp unfolding of the Azoarcus ribozymes is between two and four. Force- 
quench refolding of the P4-P6 subdomain of the Tetrahymena ribozyme occurs through a compact 
intermediate. Subsequent formation of tertiary contacts between helices P5b-P6a and P5a/P5c-P4 
leads to the native state. The force-quench refolding pathways agree with ensemble experiments. In 
the dominant unfolding route, the N-terminal α helix of GFP unravels first, followed by disruption
of the N-terminal β strand. There is a third intermediate that involves disruption of three other
strands. In accord with experiments, the force-quench refolding pathway of GFP is hierarchic, with 
the rate-limiting step being the closure of the barrel. 
I. INTRODUCTION



Despite significant advances (Onuchic and Wolynes, 2004; Thirumalai and Hyeon, 2005), 
major unsolved problems remain in our understanding of how monomeric RNA and protein 
molecules navigate the rough energy landscape to reach their folded states. Single-molecule 
experiments, which use mechanical force to manipulate the initial conformations, have begun to 
provide a deeper understanding of the folding mechanisms of proteins (Fernandez and Li, 2004; 
Cecconi et al., 2005) and RNA (Onoa et al., 2003; Liphardt et al., 2001). A combination of 
forced unfolding and force-quench refolding of a number of proteins (Li et al., 2006; Fernandez 
and Li, 2004; Cecconi et al., 2005; Best and Hummer, 2005; Isralewitz et al., 2001; Gerland
et al., 2003) and RNA, including the polynucleotide L-21 Tetrahymena thermophila ribozyme 
(Onoa et al., 2003), and the large (230 amino acid residues) green fluorescent protein (GFP) 
(Dietz and Rief, 2004) has been used to map the energy landscape of RNA and proteins. These 
experiments identify kinetic barriers and the nature of intermediates by using mechanical un- 
folding or refolding trajectories that monitor the end-to-end distance (R(t)) of the molecule in 
real time (t) or from the force-extension curves (FECs). 

It is difficult to unambiguously infer the structural details of the intermediates by using 
only R(t) and FEC. Moreover, assigning kinetic barriers from FECs or from the distribution of 
unbinding forces (or rates) is not always unique (Derenyi et al., 2004; Hyeon and Thirumalai, 
2006). The power of single-molecule force spectroscopy is enhanced when combined with reliable 
computations that can be carried out under conditions that mimic the experimental conditions 
as closely as possible. Toward this end, we use a self-organized polymer (SOP) model (see 
Experimental Procedures) to predict the forced unfolding and force-quench refolding of the L-21 
Tetrahymena thermophila ribozyme and GFP. Several studies (Chen and Dill, 2000; Treiber
and Williamson, 2001; Sosnick and Pan, 2003; Thirumalai et al., 2001; Das et al., 2003) have 
shown that the RNA energy landscape is rugged. The SOP model is also used to obtain a 
number of new, to our knowledge, results for mechanical unfolding and force-quench refolding 
of the large-sized protein GFP whose folding has been difficult to probe by using conventional 
experiments because the slow folding times often lead to aggregation (Zimmer, 2002). As a 
result, only a few ensemble folding experiments for GFP (Zimmer, 2002; Fukuda et al., 2000), 
which is used as a marker in a number of biotechnology applications that include its use as a
reporter gene and as a fusion tag to visualize cellular events, have been performed. 

Here, we make significant advances in using coarse-grained models to study single-molecule 
force spectroscopy of large RNA and proteins. The use of the SOP model has enabled 
us to probe the structural details of the forced-unfolding pathways of the T. thermophila
ribozyme and related ribozymes and GFP over a wide range of loading rates. For RNA
and proteins, the dominant unfolding pathway depends on the loading rate, r_f. After
establishing the validity of the method by successfully obtaining the experimentally inferred
major unfolding pathway for the T. thermophila ribozyme, we predict the order of r_f-dependent
unfolding events in the Azoarcus ribozyme. Application to GFP reveals the structural details 
of the intermediates identified in forced-unfolding AFM experiments. Refolding simulations 
upon force quench of the independently folding subdomain P4-P6 of the L-21 construct 
and GFP show that the assembly of these molecules from stretched states occurs in stages. 
In both cases, tertiary interactions that stabilize the native conformation form between 
preformed secondary structural elements. Thus, upon force quench, P4-P6 and GFP refold 
in a hierarchical manner (Brion and Westhof, 1997; Sclavi et al., 1998; Baldwin and Rose, 1999).



II. RESULTS AND DISCUSSION 



Summary of the SOP Model: Before presenting the results of our work on a variety 
of systems, it is useful to discuss more fully the advantages and the limitations of the SOP 
model. In order to predict the pathways in forced unfolding and force-quench refolding of
proteins and RNA under conditions (pulling speeds or loading rates) that are close to those 
used in experiments, it is necessary to use coarse-grained models (see Experimental Procedures). 
The effective interactions between the sites in the coarse-grained representation of proteins 
and RNA involve averaging over degrees of freedom that cannot be easily or fully resolved 
in experiments. This is the case in laser optical tweezer (LOT) and atomic force microscopy 
(AFM) experiments, which cannot resolve structures on length scales that are much smaller 
than about 1 nm. Moreover, in the interpretation of the FECs of RNA and proteins, it is 
tacitly assumed that unraveling of secondary structures occurs in blocks. In other words, the
released length that corresponds to a given number of nucleotides or residues is assigned to 
the rupture of specific secondary structures in the case of proteins, or hairpins in the case of 
RNA. With these observations, we constructed the SOP model and kept only one interaction 
site for each nucleotide or amino acid residue. Such a procedure for coarse graining has already 
been used in building models for much larger complexes (Sali et al., 2003), where it is not 
possible (or necessary) to take into account atomic details. The energy function that we chose is 
simple and consists of terms that are normally employed in more elaborate descriptions. They 
include chain connectivity and interactions that stabilize the native structures. In the current 
version, we neglected interactions between residues or nucleotides that are not present in the 
native structure, i.e., no attractive non-native interactions are allowed. Neglect of nonnative 
interactions will not affect the FECs qualitatively or quantitatively because, as stated above, 
in the current experimental setup, only unraveling that occurs as secondary structure blocks 
(proteins and RNA) is resolved. Furthermore, in the analysis of FEC it is assumed that once 
a given local secondary structure unravels it remains stretched until the molecule fully extends. 
To a large extent, the present computations on unfolding of the ribozyme and GFP support 
such an interpretation. 
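
The ingredients listed above (one interaction site per residue or nucleotide, a chain-connectivity term, attractive interactions only between native contacts, and no attractive non-native interactions) can be illustrated with a minimal sketch. The functional forms used here (a FENE-type bond term, a 12-6 Lennard-Jones well for native contacts, a sixth-power repulsion for all other pairs), the contact cutoff, the minimum chain separation of three sites, and every parameter value are assumptions made for illustration; the text above specifies only which classes of terms are retained.

```python
import math

def sop_like_energy(coords, native, contact_cutoff=0.8, k_bond=2000.0,
                    r_tol=0.2, eps_native=1.0, eps_rep=1.0, sigma=0.38):
    """Energy of a one-site-per-residue chain; every parameter value is illustrative."""
    n = len(coords)
    energy = 0.0
    # Chain connectivity: FENE-type bonds centered on the native bond lengths.
    for i in range(n - 1):
        stretch = (math.dist(coords[i], coords[i + 1])
                   - math.dist(native[i], native[i + 1]))
        x = stretch / r_tol
        if abs(x) >= 1.0:
            return math.inf  # bond stretched beyond its allowed range
        energy -= 0.5 * k_bond * r_tol ** 2 * math.log(1.0 - x * x)
    # Non-bonded pairs that are at least three sites apart along the chain.
    for i in range(n - 3):
        for j in range(i + 3, n):
            r = math.dist(coords[i], coords[j])
            r0 = math.dist(native[i], native[j])
            if r0 < contact_cutoff:
                # Native contact: attractive well of depth eps_native at r = r0.
                s6 = (r0 / r) ** 6
                energy += eps_native * (s6 * s6 - 2.0 * s6)
            else:
                # Non-native pair: purely repulsive, never attractive.
                energy += eps_rep * (sigma / r) ** 6
    return energy
```

With such a function, the native conformation sits at the bottom of every native-contact well, so any stretched conformation with intact bonds has a higher energy; this is the sense in which native-only interactions minimize energetic frustration.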

The simplicity of the SOP model allows us to use pulling speeds that are employed in AFM 
experiments. As a result, the forces predicted for GFP are in near quantitative agreement with 
measured values (see below). In the case of RNA, the pulling speeds used in the simulations are 
about three orders of magnitude greater than in the LOT experiments. Therefore, the predicted 
unfolding forces are higher. In contrast to our simulations, the pulling speeds used in all-atom 
molecular dynamics simulations are between six and eight orders of magnitude greater than 
in AFM experiments and are nearly ten orders of magnitude larger than in LOT experiments. 
Thus, it is not possible to reproduce the FECs (the experimental observable) to the accuracy 
reported here by using model force fields employed in all-atom molecular dynamics simulations. 
The present approach should be viewed as complementary to the more elaborate models that are 
often used to provide insights into the role that solvent plays in facilitating mechanical unfolding 
(Isralewitz et al., 2001; Gao et al., 2002). 
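
The comparison of pulling speeds above can be made concrete with a short sketch. It assumes the common approximation that, in constant-speed pulling, the nominal loading rate is the product of the transducer spring constant and the pulling speed, r_f = k_s v; the function names and any numerical inputs are hypothetical and are not taken from the simulations reported here.

```python
import math

def loading_rate(spring_constant_pn_per_nm, pulling_speed_nm_per_s):
    """Nominal loading rate r_f = k_s * v, in pN/s (assumed approximation)."""
    return spring_constant_pn_per_nm * pulling_speed_nm_per_s

def orders_of_magnitude_apart(rate_a, rate_b):
    """Base-10 logarithm of the ratio of two loading rates or pulling speeds."""
    return math.log10(rate_a / rate_b)
```

For example, two pulling speeds that differ by a factor of 1000 at the same spring constant are three orders of magnitude apart in loading rate, which is the kind of gap quoted above between the RNA simulations and the LOT experiments.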

In contrast to forced unfolding, the neglect of nonnative interactions can affect force-quench
refolding pathways and timescales. The exclusive use of native interactions minimizes energetic
frustration and helps in creating a positive gradient toward the native structure. However, for 
large molecules (such as GFP) there is a distinct possibility of the molecule being topologically 
entangled (even when only interactions between contacts in the native state are included) 
because of the complexity of the native state structures. Hence, our findings (see below) 
that force-quench refolding leads to pathways consistent with those inferred from ensemble 
experiments for both GFP and the P4-P6 subdomain are remarkable and highly
nontrivial. It should be stressed that the use of other more computationally demanding models 
cannot even begin to simulate force-quench refolding. Given the limitations of other computa- 
tional methods, the insights gained regarding the predictions for force-quench simulations with 
the SOP model are encouraging. 

The L-21 T. thermophila Ribozyme: The folding of the L-21 construct of the Tetrahymena
ribozyme (Figure 1A) and its independently folding subdomains (P4-P6 and P5abc) in
various ionic conditions have been extensively investigated (Das et al., 2003; Thirumalai et al., 
2001; Treiber and Williamson, 2001). By probing the unfolding characteristics of increasingly 
larger constructs of the L-21 ribozyme by using LOT experiments, Onoa et al. (2003) were able 
to associate the force peaks in the FECs to rips (or rupture) of specific substructures. By using 
this strategy and two other methods, Onoa et al. (2003) have provided an outline of the forced- 
unfolding pathway of RNA. They assumed that extension by a certain length corresponds to 
unraveling of the entire helical substructures. With this assumption, the unfolding pathway of 
ribozymes can be inferred from FECs alone. In the presence of Mg2+, the FEC for the L-21 T.
thermophila ribozyme has eight peaks. It is difficult to unambiguously assign the specific paired 
helices that unravel in the absence of the structure of the T. thermophila ribozyme. The num- 
ber of peaks in the FECs also varies depending on the specific molecule that is being stretched. 
In addition, there are multiple unfolding routes (Onoa et al., 2003) that may be indicative of 
heterogeneity in force-induced unfolding. 

As a first step in the validation of the SOP model, we computed the FEC for the T. thermophila
ribozyme at three loading rates. The Westhof model (Lehnert et al., 1996) was used as the initial
conformation in the Brownian dynamics simulations; the atomic coordinates of the T. thermophila
ribozyme, TtLSU.pdb, were obtained from the Group I and II sections of the website
http://www-ibmc.u-strasbg.fr/upr9002/westhof. In agreement with LOT experiments (Onoa et al.,
2003), we find that in the majority of unfolding cases the FECs have about eight peaks
(Figures 1B-1D). It should be emphasized that the number of peaks varies
from molecule to molecule just as in LOT experiments. Such variations from sample to sample 
are characteristic of single-molecule experiments. By explicitly comparing the FECs and the 
dynamics of the rupture history (Figures 1C and 1D), we can read off the molecular events leading
to the rips. We find two major classes of unfolding pathways. One is [N]->[P9.2]->[P9.1,
P9, P9.1a]->[P2]->[P2.1]->[P3, P7, P8]->[P6]->[P4, P5]->[P5a, P5b, P5c] (Figure 1C). Helices
in square brackets unravel nearly simultaneously. Unfolding can also occur by an alternate
route in which the order of rupture is [N]->[P2]->[P2.1]->[P9.2]->[P9, P9.1, P9.1a]->[P3, P7,
P8]->[P6]->[P4, P5]->[P5a, P5b, P5c] (Figure 1D). The difference between the two pathways is
the switch in the order of unfolding of the peripheral domains (P2 and P9). Given the topology
of the native structure, this is a significant variation (see the end of the subsection). The
experimentally inferred pathway is [N]->[P9.2]->[P9.1]->[P9, P9.1a]->[P2, P2.1]->[P3, P7, P8]->[P6,
P4]->[P5]->[P5a, P5b, P5c]. The simulation results for the first pathway are in excellent
agreement with the most probable pathway inferred from experiments.

The structures in Figure 2 give a visual representation of the conformational changes that 
occur in the unfolding transition. The advantage of the simulations is that they provide the 
structural details, albeit at the coarse-grained level, of the populated intermediates and an 
estimate of their lifetimes. Because of the differences between the loading rates and the spring 
constant used in the simulations (see caption to Figure 1) and experiments, the predicted FECs 
do not quantitatively agree with the measurements. However, the order of unfolding of the 
helices and the heterogeneous nature of the unfolding pathways are in very good accord with 
experiments. 

A few additional comments about our results are worth making. First, both simulations and 
experiments (Onoa et al., 2003) find that the peripheral domains unravel before disruption of 
the tertiary interactions involving the catalytic core. Complete rupture occurs when helices P6, 
P4, and P5abc unfold. Second, the unfolding pathways depend critically on the loading rate,
r_f. At the lowest loading rate in our simulations, the predicted unfolding pathways coincide
with the results of Onoa et al. (2003). As r_f increases by an additional factor of four, we find
that the catalytic core (P3-P7-P8) interactions unravel before P2 (data not shown). At even
higher loading rates, the T. thermophila ribozyme unfolds sequentially, starting from the P9
domain and ending at the P2 domain. The order of unfolding is determined by r_f and by the
rate at which tension propagates along the structure. Third, it is worth stressing that the two
unfolding pathways are not trivially related to each other. In one pathway, unfolding is initiated 
from P2, while in the other unraveling starts from the P9 end. From a structural perspective, 
P2 forms tertiary interactions with P5c (Figure 1A), whereas the P9 helix is in contact with P5. 
The free energies of the tertiary interactions involving the P2 and P9 domains are also different. 
Thus, from both the energetic and structural considerations, the two unfolding
pathways are significantly different. Accurate prediction of the pathways and the associated
r_f-dependent amplitudes requires a combination of simulations and experiments. Fourth, the
rips corresponding to the peripheral domain P9 in the simulations are [P9.2]->[P9.1, P9,
P9.1a], whereas in the experiments three rips corresponding to [P9.2]->[P9.1]->[P9, P9.1a] are
identified. The two rips corresponding to [P2]->[P2.1] from the peripheral domain P2 also
differ from the single rip [P2, P2.1] in the experiment. The minor differences are due to the
slight variations in the constructs used in experiments and simulations. The LOT experiments
used the L-21 construct that contains 390 nucleotides whose secondary structure map shows
(Onoa et al., 2003) that P1 is not present. The T. thermophila ribozyme used in the simulations
is longer (407 nucleotides) and, in addition to the presence of the P1 helix, has a slightly longer
extension at the 3' end. 

Forced-Unfolding Pathways of the Azoarcus Ribozyme Depend on the Loading 
Rate: An important prediction of the SOP model for the T. thermophila ribozyme is that the 
very nature of the unfolding pathways can drastically change depending on r_f. This suggests
that outcomes of unfolding by LOT and AFM experiments can be different. In addition, predic- 
tions of forced unfolding based on all-atom MD simulations should also be treated with caution 
unless, for topological reasons (as in the Ig27 domain from muscle protein titin) (Isralewitz et al., 
2001; Klimov and Thirumalai, 2000a; R.I.D. and D.T., unpublished data), the unfolding path- 
ways are robust to large variations in the loading rates. In order to fully explore the origins of 
the changes in the unfolding pathways as r_f is varied, we have simulated the rip dynamics of the
Azoarcus ribozyme (Figure 3A) for which experimental data are not yet available. The structure 
of the smaller (195 nt) Azoarcus ribozyme (Rangan et al., 2003) (PDB code: 1u6b) is similar to
the catalytic core of the T. thermophila ribozyme including the presence of a pseudoknot. The 
reduction in the number of nucleotides allows us to explore forced unfolding over a wide range of 
loading conditions. For the Azoarcus ribozyme, we generated ten mechanical unfolding trajectories
at three loading rates. At the highest loading rate (r_f = 2.4 x 10^5 r_f^OT), the FEC has six
conspicuous rips (red FEC in Figure 3B), whereas at the lower r_f the number of peaks is reduced
to between two and four. We identify the structures in each rip by comparing the FECs (Figure
3B) with the history of rupture of contacts (Figure 3C). At the highest loading rate, the dominant
unfolding pathway of the Azoarcus ribozyme is N->[P5]->[P6]->[P2]->[P4]->[P3]->[P1].
At medium loading rates, the ribozyme unfolds via N->[P1, P5, P6]->[P2]->[P4]->[P3], which
leads to four rips in the FECs. At the lowest loading rate, the number of rips is further reduced
to two, which we identify with N->[P1, P2, P5, P6]->[P3, P4]. Unambiguously identifying the
underlying pulling speed-dependent conformational changes requires not only the FECs, but 
also the history of rupture of contacts (Figure 3C). 
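
Counting rips in a force-extension curve, as done above, amounts to locating force peaks that are followed by a sudden drop. The sketch below is one simple way to do this on a sampled force trace; the function name, the threshold `min_drop`, and the rule of following the force down to its trough before looking for the next peak are illustrative assumptions, not the procedure used to analyze the simulated FECs.

```python
def find_rips(forces, min_drop):
    """Return indices of force peaks followed by a drop of at least min_drop.

    forces: force values sampled along increasing extension (or time).
    min_drop: smallest force drop, in the same units, that counts as a rip.
    """
    rips = []
    ref = 0          # index of the running peak (or of the trough just after a rip)
    falling = False  # True while the force is still relaxing after a rip
    for i in range(1, len(forces)):
        if falling:
            if forces[i] <= forces[i - 1]:
                ref = i  # still relaxing: follow the force down to its trough
                continue
            falling = False
        if forces[i] > forces[ref]:
            ref = i  # new running peak
        elif forces[ref] - forces[i] >= min_drop:
            rips.append(ref)  # the peak at ref was followed by a large drop
            ref = i
            falling = True
    return rips
```

A trace such as `[0, 10, 20, 30, 12, 20, 35, 50, 20, 30, 40]` with `min_drop=10` yields two rips, at the peaks of 30 and 50. Assigning each rip to a structural element still requires the contact-rupture history, as emphasized above.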

To understand the profound changes in the unfolding pathways as r_f is varied, it is necessary
to compare r_f with r_T, the rate at which the applied force propagates along RNA (or proteins).
In both AFM and LOT experiments, force is applied to one end of the chain (3' end) while
the other end is fixed. The initially applied tension propagates over time in a nonuniform
fashion through a network of interactions that stabilize the native conformation. The variable
λ = r_T/r_f determines the rupture history of the biomolecules. If λ >> 1, then the applied tension
at the 5' end of the RNA propagates rapidly so that, even prior to the realization of the first
rip, force along the chain is uniform. This situation pertains to the LOT experiments (low r_f).
In the opposite limit, λ << 1, the force is nonuniformly felt along the chain. In such a situation,
unraveling of RNA begins in regions in which the value of local force exceeds the strength of the
tertiary interactions. Such an event occurs close to the end at which the force is applied. The intuitive
arguments given above are made precise by computing the rate of propagation of force along 
the Azoarcus ribozyme. To visualize the propagation of force, we computed the dynamics of 
alignment of the angles between the bond segment vector (r_{i,i+1}) and the force direction during
the unfolding process (Figures 3D & 3F). The nonuniformity in the local segmental alignment
along the force direction, which results in a heterogeneous distribution of times in which segment 
vectors approximately align along the force direction, is most evident at the highest loading 
rate (Figure 3E). Interestingly, the dynamics of the force propagation occurs sequentially from 
one end of the chain to the other at high r_f. Direct comparison of the differences in the
alignment dynamics between the first (θ_1) and last (θ_{N-1}) angles (see Figure 3D) illustrates the
discrepancy in the force values between the 3' and 5' ends (Figure 3F). There is nonuniformity
in the force values at the highest r_f, whereas there is a more homogeneous alignment at low
r_f. The microscopic variations in the dynamics of tension propagation are reflected in the rupture
kinetics of tertiary contacts (Figure 3C) and, hence, in the dynamics of the rips (Figure 3B).
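
The alignment measure described above can be sketched as follows: for each pair of consecutive sites, compute the cosine of the angle between the bond segment vector and the force direction. The function name and the choice to return cosines rather than angles are assumptions for illustration; a value near 1 means that the segment is aligned with the applied force.

```python
import math

def alignment_cosines(coords, force_direction):
    """Return cos(theta_i) between each bond vector r_{i,i+1} and the force direction."""
    f_norm = math.sqrt(sum(c * c for c in force_direction))
    cosines = []
    for a, b in zip(coords, coords[1:]):
        bond = [bj - aj for aj, bj in zip(a, b)]
        bond_len = math.sqrt(sum(c * c for c in bond))
        dot = sum(bc * fc for bc, fc in zip(bond, force_direction))
        cosines.append(dot / (bond_len * f_norm))
    return cosines
```

Evaluating this for every frame of a trajectory and comparing the first and last entries over time gives a picture of how quickly tension reaches the two ends of the chain, in the spirit of the comparison of θ_1 and θ_{N-1}.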

Force-Quench Refolding of the P4-P6 Domain: Folding of the T. thermophila ribozyme 
induced by increasing counterions is complicated because of pausing in kinetic traps. In order to 
dissect the folding pathways of the larger ribozyme the independently folding P4-P6 domain has 
often been studied (Laederach et al., 2006; Uchida et al., 2002; Deras et al., 2000). Because of 
the interplay of a number of distinct timescales, it has not yet been possible to unambiguously 
produce the refolding pathways by using bulk experiments alone. In principle, mechanical force 
can be used as a way to initiate folding, as was shown in the context of protein folding (Fernandez 
and Li, 2004). We have followed the AFM experimental procedure to monitor refolding by 
quenching the force, f, from a high to a low (f_Q) value. We used force-quench simulations to predict the
refolding dynamics of P4-P6, starting from an initially stretched state. By setting f_Q = 0, we
monitor the dynamics of the transition from a low-entropy (rod-like initial state) structure to 
a folded low-entropy final state. We generated 20 mechanical refolding trajectories (see Figure 
4A for an example) for the P4-P6 domain (the domains enclosed by the blue rectangle in Figure 
1A) by using the coordinates of the crystal structure (PDB code: 1gid) in the simulations. The
time dependence of formation of the fraction of native contacts, Q(t), shows that the approach 
to the native state conformations occurs in steps (Figure 4A). After an initial rapid increase 
in Q(t) within t ~ 2 ms, there is a long pause (ranging from 2 to 15 ms) in a metastable state
with Q(t) ~ 0.65. A rapid cooperative transition to the folded state occurs at t ≈ 6 ms,
with Q(t ~ 6 ms) ≈ 0.8. The end-to-end distance, R(t), reaches its native value
rapidly (Figure 4B). The radius of gyration (R_G(t)), which monitors the compaction of the
P4-P6 domain, decreases in two distinct stages. After a rapid initial collapse that results
in a sharp reduction of R_G(t) by about 10 nm, a more gradual approach to the native value
(R_G^N = 3.0 nm) occurs (Figure 4B). The root mean square deviation,
Δ(t), decreases in stages and abruptly drops around t ~ 6 ms from ~3 nm to ~0.7 nm (Figure
4B). The two-stage compaction of RNA appears to be a robust mechanism of chain collapse.
Indeed, ensemble experiments that have probed the equilibrium changes in R_G of the P4-P6
domain by using small-angle X-ray scattering (SAXS) also show a two-stage relaxation to
the native state (Takamoto et al., 2004). As the counterion concentration increases, the value
of R_G decreases without forming tertiary interactions. At a higher ion concentration, the native
state is reached with the formation of tertiary interactions. The dynamic processes shown in
Figure 4B mirror the equilibrium pathway inferred from SAXS experiments. In contrast to
folding initiated by increasing the counterion concentration, much larger changes in R_G occur
in the early stages when force quench begins from a fully stretched state. 
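
The two order parameters used above, the fraction of native contacts Q(t) and the radius of gyration R_G(t), can be computed from the site coordinates of a single frame as in the following sketch. The contact cutoff, the minimum chain separation of three sites, and the rule that a contact counts as formed when the pair distance is within 1.2 times its native value are assumptions chosen for illustration.

```python
import math

def native_contact_pairs(native, cutoff, min_separation=3):
    """List pairs (i, j) that are closer than cutoff in the native structure."""
    n = len(native)
    return [(i, j) for i in range(n) for j in range(i + min_separation, n)
            if math.dist(native[i], native[j]) < cutoff]

def fraction_native_contacts(coords, native, pairs, tolerance=1.2):
    """Q: fraction of native pairs whose distance is within tolerance of native."""
    if not pairs:
        return 0.0
    formed = sum(1 for i, j in pairs
                 if math.dist(coords[i], coords[j])
                 <= tolerance * math.dist(native[i], native[j]))
    return formed / len(pairs)

def radius_of_gyration(coords):
    """R_G: root mean square distance of the sites from their centroid."""
    n = len(coords)
    center = [sum(p[d] for p in coords) / n for d in range(3)]
    return math.sqrt(sum(math.dist(p, center) ** 2 for p in coords) / n)
```

Applied frame by frame to a refolding trajectory, these functions give Q(t) and R_G(t) traces of the kind discussed above, in which a pause in Q(t) at an intermediate value signals a metastable state.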

Examination of the tertiary contact formation at the nucleotide level (Figure 4C) shows that 
the assembly of the native structure is initiated from the P5b and P6a hairpin loop within about 
5 ms. Subsequently, zipping of the secondary structures takes place (the structure in Figure 
4A). Formation of loops from the initial single-stranded structure is the key nucleation event 
that triggers formation of the helices. The rate-determining step in the native state formation 
involves a search in the ensemble of conformations with preformed secondary structures. Upon 
cooperative formation of the tertiary contacts involving P5b-P6a and P5a/P5c-P4 (Figure 4C), 
structural transition to the native state occurs. The force-quench refolding mechanism reveals 
the hierarchical nature of RNA structures (Brion and Westhof, 1997; Sclavi et al., 1998). It
has been difficult to precisely pinpoint the refolding of the various structural elements of the 
P4-P6 domain by using ensemble experiments (Uchida et al., 2002; Laederach et al., 2006). 
However, a wealth of experimental data and recent novel analysis methods (Laederach et al., 
2006) reveal that the early formation of hairpins in the P5abc domain (nearly simultaneous 
formation of P5c and P5b) directs the refolding of the P4-P6 domain of the Tetrahymena 
ribozyme. These events are followed by tertiary interactions between P5b-P6a and the helices 
from the P5abc subdomain and the P4 and P6 helices. Remarkably, the inferred pathway from 
ensemble experiments (Laederach et al., 2006; Uchida et al., 2002) is almost coincident with
the force-quench simulations presented here. The limited comparison of the relaxation of R_G
and the approach to the native state obtained by using simulations and experiments shows 
that the SOP model is successful in identifying refolding pathways of complex RNA structures. 
It would be interesting to experimentally validate the predictions by performing force-quench 
experiments at the single-molecule level. 

III. APPLICATION TO PROTEINS 

The versatility of the SOP model is demonstrated by making several nontrivial predictions 
for forced unfolding and force-quench refolding of GFP, which has a complex three-dimensional
structure (Figure 5A). We undertook these simulations to provide insights into constant loading 
rate AFM experiments on GFP that were used to construct its partial energy landscape (Dietz 
and Rief, 2004). In the AFM experiments, two unfolding intermediates were identified. Disruption
of H1 (Figure 5A) results in the first intermediate, GFPΔα. The second intermediate,
GFPΔαΔβ, was conjectured to result from unraveling of either β1 from the N terminus or β11 from
the C terminus (Figure 5A). Both β1 and β11 have the same number of residues, making the
assignment of the strand that unravels first by using FEC alone impossible (Dietz and Rief, 
2004). In general, precise assignment of the structural characteristics of the intermediates with 
FEC alone is difficult not only because of the complex topology of GFP, but also because, unlike 
in RNA, the substructures of GFP may be unstable. More generally, because secondary struc- 
tures in proteins are typically unstable in the absence of tertiary interactions, it is impossible 
to obtain the unfolding pathways from FEC alone. The native state of GFP (PDB code: 1gfl;
Figure 5A) consists of 11 β strands, 3 helices, and 2 relatively long loops. A two-dimensional
connectivity map of the β strands shows that β4, β5, β6 and β7, β8, β9 are essentially disjointed
from the rest of the structure (Figure 5B). From the structure alone we expect that the
strands in the substructures (Dβ1 = [β4, β5, β6]) and (Dβ2 = [β7, β8, β9]) would unravel almost
synchronously. However, it is not possible to predict the order of unfolding, the diversity of the 
unfolding pathways, or the number of intermediates without reliable computations with pulling 
speeds that match those used in the AFM experiments (Dietz and Rief, 2004). We probed the
structural changes that accompany the forced unfolding of GFP by using FECs and the dynamics
of rupture of contacts at v = 2.5 μm/s (~ 2.5 v_AFM). The FECs in a majority of molecules
have several peaks (Figure 5C) that represent unfolding of specific secondary structural elements.
By following the dynamics of residue-dependent breakup of contacts (Figure 5D), the
structures that unravel can be unambiguously assigned to the FEC peaks. Unfolding begins
with the rupture of H1 (leading to the intermediate GFPΔα), which results in the extension
by about Δz ≈ 3.2 nm (Figure 5C). The force required to disrupt H1 is about 50 pN (Figure
5C), which compares well with the experimental estimate of ≈35 pN (Dietz and Rief, 2004) at
the lower pulling speed. In the second intermediate, GFPΔαΔβ, β1 unfolds (Dietz and Rief,
2004). The value of the force required to unfold β1 is about 100 pN (Figure 5C), which is also
roughly in accord with the experiment (Dietz and Rief, 2004). The measured unfolding force
that corresponds to the second intermediate ranges from 70 to 100 pN (Dietz and Rief, 2004).
After the initial events, the unfolding process is complex. For example, ruptured interactions
between strands β2 and β3 transiently reform (Figures 5C and 5D). The last two rips represent
unraveling of Dβ1 and Dβ2, in which the strands in Dβ1 and Dβ2 unwind nearly simultaneously.
The structures that remain after the various rips (labeled [i] — [iv] in Figure 5D) are shown in 
Figure 5E. 

Besides the dominant pathway, there is a parallel unfolding route in some of the trajectories. 
In the alternate pathways (Figure 6A), the C terminus strand β11 unfolds after the formation
of GFPΔα (Figure 6B). In both the dominant and the subdominant routes, the simulations
identify multiple intermediates. To assess if the intermediates in the dominant pathway are 
too unstable to be detected experimentally, we have calculated the accessible surface area of 
the substructures by using the PDB coordinates for GFP. The structures of the intermediates 
are assumed to be the same upon rupture of the secondary structural elements, and hence our
estimate of surface area is a lower bound. The percentage of exposed hydrophobic residues
in the intermediate [β2, β3, β11] is 25%, compared to 17.4% for the native fold, whereas in
excess of 60% of the hydrophobic residues in ΔDβ2 are solvent accessible. We conclude that the
intermediate [β2, β3, β11], in which H1, β1-β3, and β11 partially unfold, is stable enough
to be detected. However, the lifetimes of the late-stage intermediates are likely to be too short
for experimental detection. In the subdominant unfolding route, the barrel flattens after the
rupture of β11, thus exposing in excess of 50% of hydrophobic residues. As a result, we predict
that there are only two detectable intermediates.

Refolding of GFP upon Force Quench: To initiate refolding, we reduced the force on
the fully stretched GFP to f_Q = 0. Formation of secondary structures and establishment of
a large number of tertiary contacts occurs rapidly in ≈0.25 ms (Figure 7). Subsequently, the
molecule pauses in a metastable intermediate state with Q(t) ≈ 0.7 in which all of the secondary
structural elements are formed but the characteristic barrel of the native state is absent. The
transition from the metastable intermediate to the native basin of attraction, during which the
barrel forms, is the rate-limiting step that occurs abruptly, with Q(t) reaching 0.8 (Figure 7A).
Native state formation is signaled by the closure of the barrel and the accumulation of the long-range
contacts between H1 and the rest of the structure. Both R(t) and R_G(t) decrease nearly
continuously, and only in the final stages is there a precipitous drop in R(t) and Δ(t) (see inset
in Figure 7A). The time dependence of Δ(t) shows that the root mean square deviation of the
intermediate from the native state is about 20 Å, whereas the final refolded structure deviates
by only ≈ 3 Å from the native conformation. Contact formation at the residue level (Figure
7B) shows that interactions between β3 and β11 and between β1 and β6 are responsible for
barrel closing. The assembly of GFP appears to be hierarchical in the sense that the secondary 
structural elements form prior to the establishment of the tertiary interactions. 
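
The observables used to follow refolding here, R(t), R_G(t), and Q(t), are all simple functions of the bead coordinates. The following is a minimal Python sketch of how they could be evaluated for a single snapshot. The function names, the equal-mass assumption for R_G, and the toy four-bead chain are illustrative choices and are not taken from the simulation code used in this work. Q is computed as the fraction of a supplied list of native pairs whose current separation is below a cut-off.

```python
import math

def end_to_end_distance(coords):
    """Distance R between the first and the last bead."""
    return math.dist(coords[0], coords[-1])

def radius_of_gyration(coords):
    """Root mean square distance of the beads from their centroid (equal masses assumed)."""
    n = len(coords)
    centroid = [sum(c[k] for c in coords) / n for k in range(3)]
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)

def fraction_native_contacts(coords, native_pairs, r_cut):
    """Fraction Q of the native pairs (i, j) that are currently closer than r_cut."""
    formed = sum(1 for i, j in native_pairs
                 if math.dist(coords[i], coords[j]) < r_cut)
    return formed / len(native_pairs)

# Hypothetical toy chain: four beads stretched along x with 3.8 A spacing.
stretched = [(3.8 * i, 0.0, 0.0) for i in range(4)]
# Hypothetical compact arrangement of the same four beads (a square).
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
native_pairs = [(0, 3)]  # assumed single native contact, for illustration only

for name, conf in (("stretched", stretched), ("compact", compact)):
    print(name,
          round(end_to_end_distance(conf), 2),
          round(radius_of_gyration(conf), 2),
          fraction_native_contacts(conf, native_pairs, r_cut=8.0))
```

Applied frame by frame to a trajectory, these functions would give time traces analogous to those in Figure 7A; the root mean square deviation Δ(t) additionally requires a structural superposition and is not included in this sketch.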

It is interesting to compare the force-quench refolding results to known pathways that have 
been inferred from kinetics of Cycle 3 GFP refolding from an acid-denatured state by using 
stopped-flow CD and fluorescence techniques (Enoki et al., 2004; Fukuda et al., 2000). Because 
the chemical structure of the p-hydroxybenzylideneimidazolidone chromophore remains intact
in the acid-denatured state, they used the green fluorescence of the chromophore to monitor
the formation of the native structure. Within the dead time of the instrument they observed
a nonspecific collapse that in our simulations is manifested as a sharp decrease in R_G (see the
inset of Figure 7A). They proposed that there is partial secondary structure formation in the
nonspecific collapse state. Such an interpretation was made in light of the increase in tryptophan
fluorescence that showed that Trp57 (the only tryptophan in GFP) is solvent inaccessible. This
implies that there is at least partial ordering of the structure around position 57. On the other
hand, the absence of chromophore fluorescence in this phase indicates that the chromophore
(which lies more toward the C-terminal end of the chain than Trp57) is still solvent accessible
due to a lack of formation of a rigid specific structure around it. In good agreement with these
results, in our simulations we found that the first native contacts to form are those involving
residues 1-65 (i.e., spanning the β1-β3 region), which are in the neighborhood of Trp57 (see
Figure 7B). 

The lag phase in the chromophore fluorescence was used to suggest the presence of an 
on-pathway intermediate that is more compact than the burst-phase intermediate. More 
importantly, in the intermediate many of the secondary structural elements of GFP (i.e., all 
of the β strands) are formed. However, the barrel is not yet present, as there is very little
fluorescence from GFP (i.e., the chromophore is solvent accessible). It is likely that in this
intermediate Trp57 is closer in space to the chromophore because its fluorescent signal is
quenched. The long-lived metastable intermediate found during our GFP refolding simulations
(see the structure in Figure 7A) has the same characteristics as the structure inferred from
experiments. Our force-quench simulations show that all of the β strands are formed; therefore,
Trp57 is close to the chromophore location. In addition, the characteristic barrel of GFP is
not formed. Indeed, we predict the closing of the barrel to be the rate-determining step in
the refolding. The explicit comparisons between simulation results and conclusions from bulk
experiments that are based on interpretations of spectroscopic data show, just as for RNA,
that, even for systems with complex architecture, refolding pathways can be reliably predicted
using the SOP model. 



IV. CONCLUDING REMARKS 

The ability to monitor folding and unfolding of biopolymers, starting from arbitrary regions
of the energy landscape, makes force spectroscopy unique. In recent years, LOT and AFM
experiments have been used to produce FECs for large RNA and for proteins from which their
energy landscapes have been constructed. In order to provide a structural interpretation of the
FEC results, it is necessary to use simulations of mechanical unfolding of ribozymes and GFP
under loading conditions that are typically used in experiments. We have introduced the SOP
model to study mechanical unfolding and force-quench refolding of large RNA molecules and
proteins. After establishing the reliability of the model, we have made a number of testable
predictions for both RNA and proteins by using the SOP model. For RNA and GFP, we find
that the unfolding pathways are critically dependent on the pulling speed. The order of unfolding
is determined by the ratio of the applied loading rate to the rate (a topology-dependent
variable) at which force propagates along the molecules. Typically, at very high loading rates,
unfolding occurs along a unique topology-dependent pathway, whereas at lower r_f, excursions
from the dominant pathway can be found. The crucial role played by the loading rate suggests
that a detailed picture of the energy landscape can only be obtained by using a combination 
of LOT and AFM experiments. The form of the energy function used in the simulations 
is identical for both RNA and proteins. Despite the simplicity, the model reproduces the 
experimentally inferred order of unfolding in both RNA and proteins. In the case of GFP, the 
simulations clearly resolve the nature of the second intermediate and further predict that a 
third intermediate must be observable in the FEC. In addition, the SOP gives the structures of 
the intermediates that are almost impossible to obtain from experiments. Because the pulling 
speeds used in the GFP simulations are similar to those used in experiments, the predicted 
unbinding forces are in close agreement with measured values. The applications to ribozymes
and enzymes show that, for certain problems, a unified perspective of the energy landscape 
governing folding of RNA and proteins can be obtained (Thirumalai and Hyeon, 2005). The 
SOP model was used to monitor the dynamics of refolding upon force quench. These simulations 
are important in interpreting, at the molecular level, experiments that can only obtain the 
dynamics of the end-to-end distance (R(t)) relaxation upon force quench. Our simulations, 
which identify structural details of the intermediates in the refolding of P4-P6 and GFP, clearly 
show that the routes explored in the mechanical unfolding process do not coincide with those in 
the refolding process. The SOP simulations have also been used to obtain the dynamics of the
compaction process under folding conditions (f_Q = 0), which are extremely difficult to obtain
from single-molecule experiments. Even time-resolved small-angle X-ray scattering experiments
cannot currently resolve the behavior of R_G at short times. The refolding of the P4-P6
subdomain and GFP occurs by a hierarchical mechanism (Baldwin and Rose, 1999; Brion and
Westhof, 1997). In RNA, the separation of energy scales involving secondary and tertiary
interactions is the reason for the hierarchical assembly. The force-quench refolding of GFP
suggests that large proteins are more likely to follow hierarchical assembly than small, globular
proteins. Perhaps the hierarchical assembly mechanism restricts the severity of pausing
for long periods of time in kinetic traps that, in the absence of energetic frustration, arise
due to topological frustration (Thirumalai and Woodson, 1996). Future force-quench experiments
at the single-molecule level would be invaluable in checking the predictions of this work.

V. EXPERIMENTAL PROCEDURES 

Self-Organized Polymer Model: We introduce a versatile coarse-grained, structure-based
model, referred to as the self-organized polymer (SOP) model, that can be adapted to describe
forced unfolding of proteins and RNA. In order to simulate force-ramp and force-quench folding
and unfolding of large RNA and proteins under conditions that are close to those used in
experiments, we are forced to employ such coarse-grained, structure-based models (see below). Use
of the SOP model to explore mechanical unfolding is justified for the following reasons. First, 
force-induced unfolding results in a 10-100 nm increase in the end-to-end distance (R). With 
the current spatial resolution in single-molecule forced-unfolding experiments, it is not possible 
to resolve the changes at the atomic level, i.e., it is not possible to resolve structural changes on
length scales less than 1 nm. As a result, changes at small length scales are masked in mechanical 
unfolding experiments that only provide direct information on force-extension curves (FECs). 
We have, therefore, sought models that can reproduce experimental FECs as accurately as 
possible under loading conditions that are similar to those used in LOT and AFM experiments. 
Thus, while these models cannot describe changes at the atomic level (Pabon and Amzel, 2006; 
Gao et al., 2002), they are versatile enough to explore mechanical unfolding and force-quench 
refolding over a wide range of external conditions. In some sense, the present simulations should 
be viewed as a complement to the steered molecular dynamics simulations (Isralewitz et al.,
2001). Second, our previous work (Klimov and Thirumalai, 2000a) established that accurate
estimates of unfolding forces and the pulling speed variations of the most probable values of
the disruption forces for proteins can be made by using the native topology alone. It follows, 
therefore, that as long as interactions that stabilize the native fold are adequately taken into 
account, many aspects of forced unfolding can be semiquantitatively predicted. 

With the above-described observations in mind, we introduce the SOP model, which is suitable
for accurately predicting the kinetic barriers in large RNA and proteins. We represent each
nucleotide or residue by a single interaction center. The total energy function of a
conformation, specified in terms of the coordinates r_i (i = 1, 2, ..., N), where N is the number
of nucleotides or residues, is given by:

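\begin{equation}
\begin{aligned}
V_T &= V_{\mathrm{FENE}} + V_{\mathrm{NB}}^{\mathrm{ATT}} + V_{\mathrm{NB}}^{\mathrm{REP}} \\
    &= -\sum_{i=1}^{N-1} \frac{k}{2} R_0^2 \log\left(1 - \frac{\left(r_{i,i+1} - r^{0}_{i,i+1}\right)^2}{R_0^2}\right) \\
    &\quad + \sum_{i=1}^{N-3} \sum_{j=i+3}^{N} \epsilon_h \left[\left(\frac{r^{0}_{ij}}{r_{ij}}\right)^{12} - 2\left(\frac{r^{0}_{ij}}{r_{ij}}\right)^{6}\right] \Delta_{ij} \\
    &\quad + \sum_{i=1}^{N-2} \sum_{j=i+2}^{N} \epsilon_l \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \left(1 - \Delta_{ij}\right)
\end{aligned}
\tag{1}
\end{equation}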

The finite extensible nonlinear elastic (FENE) potential describes the backbone chain connectivity.
The distance between two neighboring interaction sites, i and i + 1, is r_{i,i+1}, and r^0_{i,i+1} is its value
in the native structure. We use the Lennard-Jones potential to account for the interactions that
stabilize the native state. If the noncovalently linked beads i and j for |i - j| > 2 are within a
cut-off distance, R_C (i.e., r^0_{ij} < R_C), then Δ_ij = 1. If r^0_{ij} > R_C, then Δ_ij = 0. A uniform value
for ε_h, which specifies the strength of the nonbonded interactions, is assumed. All nonnative
interactions (third term in Equation 1) are repulsive. To impose a constraint on the bond angle
between the i, i + 1, and i + 2 beads, the repulsive potential is used with parameters determining
the strength, ε_l, and the range of repulsion, σ_{i,i+2}. We used σ_{i,i+2} = σ/2 for RNA and σ_{i,i+2} = σ
for proteins. To prevent interchain crossing, we chose the appropriate value of σ (see Table 1).
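
As an illustration of how the three terms of Equation 1 fit together, the sketch below evaluates the SOP energy of a conformation in plain Python. It is a sketch under stated assumptions, not the production code: the function name `sop_energy`, the default parameters (protein column of Table 1, with ε_h = 1 kcal/mol picked arbitrarily from the quoted range), the toy four-bead structures, and the choice to apply σ_{i,i+2} to pairs with j = i + 2 and σ to all other nonnative pairs are assumptions made for this example.

```python
import math

def sop_energy(coords, native, k=20.0, R0=2.0, Rc=8.0,
               eps_h=1.0, eps_l=1.0, sigma=3.8, sigma_i2=3.8):
    """SOP energy (kcal/mol) of `coords`, given the native structure `native`.

    Lengths are in Angstrom. For RNA one would pass sigma_i2 = sigma / 2.
    Bond deviations must stay below R0, otherwise the FENE logarithm is undefined.
    """
    n = len(coords)
    energy = 0.0
    # First term: FENE backbone connectivity between beads i and i+1.
    for i in range(n - 1):
        dr = math.dist(coords[i], coords[i + 1]) - math.dist(native[i], native[i + 1])
        energy += -0.5 * k * R0 ** 2 * math.log(1.0 - dr ** 2 / R0 ** 2)
    # Second and third terms: nonbonded pairs with j >= i + 2.
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            r0 = math.dist(native[i], native[j])
            is_native = (j - i > 2) and (r0 < Rc)  # Delta_ij = 1
            if is_native:
                x6 = (r0 / r) ** 6
                energy += eps_h * (x6 * x6 - 2.0 * x6)  # Lennard-Jones attraction
            else:
                s = sigma_i2 if j == i + 2 else sigma   # assumed assignment of sigma_ij
                energy += eps_l * (s / r) ** 6          # soft repulsion
    return energy

# Hypothetical toy "native" structure: four beads on a square of side 3.8 A.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
# The same chain pulled straight, keeping the native bond lengths.
stretched = [(3.8 * i, 0.0, 0.0) for i in range(4)]

e_native = sop_energy(native, native)
e_stretched = sop_energy(stretched, native)
print(f"native: {e_native:.3f} kcal/mol, stretched: {e_stretched:.3f} kcal/mol")
```

In this toy case the only native contact is the pair (0, 3), so the native conformation gains -ε_h from that contact while the stretched chain loses it. The double loop scales as N^2 and is meant for clarity rather than speed.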

There are six parameters in the energy function (see Table 1 for their values for proteins and 
RNA). Of these, the results are insensitive to the exact choices of R_0 and k that account for chain
connectivity. Similarly, the value of ε_l, which is introduced to emphasize native interactions and
prevent chain overlap, is not critical. The native structure is sensitive to ε_h (more precisely, the
ratio ε_h/ε_l) and R_C. The range of values for R_C is, to a large extent, dictated by the structures
of RNA and proteins in the Protein Data Bank (PDB). In the SOP model, the parameter ε_h
plays a central role.

A few comments about the model are needed. (1) The model differs from the off-lattice 
Go model (Clementi et al., 2000; Klimov and Thirumalai, 2000a) in two respects. First, 
chain connectivity potential in our model is nonlinear, whereas the Go model uses a harmonic 
potential. We chose the FENE potential because the FECs for RNA and proteins exhibit 
pronounced curvature at large forces, i.e., even after substantial loss of structure. The nonlinear 
FENE potential can reproduce the observed curvature in the FEC. Second, we neglect the 
dihedral angle degrees of freedom that describe local secondary structural preferences. Because 
of limitations in spatial resolution in LOT or AFM experiments, the disruption of secondary 
structures cannot be probed. Therefore, we do not include torsional potentials in Equation 
1. (2) We used a repulsive potential for nonnative interactions (third term in Equation
1), whose range is different from that of the normally used inverse 12th power potential. In our previous
study (Klimov and Thirumalai, 2000b) of β hairpin formation, we had used a similar power
to approximately mimic hydration effects. In the present context, as long as the
repulsive potential is short ranged, the precise power used is not relevant. We chose the inverse
sixth power for purely practical purposes. The force-induced Brownian dynamics simulations 
are performed under nonequilibrium conditions. Under these conditions, we find that it is 
important that appropriate values of σ and ε_l (third term in Equation 1) be chosen to prevent
RNA and proteins from nonphysical chain crossing. Furthermore, the simulations have to 
be performed for long times to observe global unfolding of the large ribozymes and proteins, 
especially at moderate and low pulling speeds. The need to follow the dynamics for long time 
periods requires us to choose a relatively large time step. For these practical reasons, we 
find that the longer-range repulsive potential (inverse sixth power), rather than the harsher 
shorter-range interactions (inverse 12th power), is more appropriate. Other forms of repulsive 
potential will not alter the results in any significant way. (3) The major limitation, especially 
in RNA applications, is that environmental changes (especially the role of counterions) are not 
adequately taken into account. These effects can be included in an approximate way by varying
ε_h (or ε_l).
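
The difference between the FENE and harmonic descriptions of chain connectivity mentioned in comment (1) can be seen by tabulating both for the same spring constant. The short sketch below uses k = 20 kcal/(mol Å^2) and R_0 = 2 Å from Table 1; the function names and the sample deviations are illustrative assumptions. For small deviations x = r - r^0 the two forms agree, while the FENE energy grows much faster as x approaches R_0.

```python
import math

def fene(x, k=20.0, R0=2.0):
    """FENE bond energy (kcal/mol) for a deviation x from the native bond length, |x| < R0."""
    return -0.5 * k * R0 ** 2 * math.log(1.0 - (x / R0) ** 2)

def harmonic(x, k=20.0):
    """Harmonic bond energy (kcal/mol) with the same spring constant."""
    return 0.5 * k * x ** 2

# Compare the two potentials from small deviations up to close to R0.
for x in (0.1, 0.5, 1.0, 1.5, 1.9):
    print(f"x = {x:.1f} A   FENE = {fene(x):7.2f}   harmonic = {harmonic(x):7.2f}")
```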
Simulations: Using the SOP model, we simulated the mechanical unfolding of proteins and
RNA by using the Brownian dynamics algorithm (Ermak and McCammon, 1978; Klimov et
al., 1998), for which the characteristic time for the overdamped motion is τ_H = (ζ ε_h h / k_B T) τ_L,
where τ_L = 4 ps for RNA and τ_L = 3 ps for proteins (Veitshans et al., 1997). Following the
experimental setup in AFM measurements, the C-terminal end is stretched at a constant
pulling speed, while the N-terminal end is kept fixed. We chose k_s = 35 pN/nm, which is in
the range of 1-100 pN/nm used in the AFM experiments. For RNA, a typical value
for the mass of a nucleotide, m, is ≈ 300-400 g/mol; the average distance between
adjacent beads, a, is 5 Å; and the energy scale, ε_h, is 0.7 kcal/mol, so that the characteristic time is
τ_L = (m a^2/ε_h)^{1/2} = 3-5 ps. We use τ_L = 4.0 ps to convert the simulation times into real times.
We used ζ = 100 τ_L^{-1} in the overdamped limit, and this
value approximately corresponds to the friction constant for a molecule in water. All of the
force simulations are performed at T = 300 K. We use an integration time step of h = 0.1 τ_L, which
implies that 10^6 integration time steps correspond to 47 μs.
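
The conversion between integration steps and real time can be checked with a few lines of arithmetic. The sketch below assumes that the characteristic time reads τ_H = (ζ ε_h h / k_B T) τ_L, with ζ in units of τ_L^{-1} and h in units of τ_L, and uses the RNA values quoted in the text (ζ = 100, ε_h = 0.7 kcal/mol, h = 0.1, τ_L = 4 ps, T = 300 K). The function name and the rounded value of the Boltzmann constant are choices made for this example.

```python
def time_per_step_ps(zeta, eps_h, h, tau_L_ps, kBT):
    """Real time per Brownian dynamics step, tau_H = (zeta * eps_h * h / kBT) * tau_L.

    zeta is given in units of 1/tau_L and h in units of tau_L, so zeta * h is
    dimensionless; eps_h and kBT must share the same energy unit.
    """
    return zeta * eps_h * h / kBT * tau_L_ps

KB_KCAL_PER_MOL_K = 0.0019872      # Boltzmann constant in kcal/(mol K)
kBT = KB_KCAL_PER_MOL_K * 300.0    # thermal energy at T = 300 K, about 0.6 kcal/mol

tau_H = time_per_step_ps(zeta=100.0, eps_h=0.7, h=0.1, tau_L_ps=4.0, kBT=kBT)
n_steps = 1_000_000
total_us = tau_H * n_steps / 1e6   # 1 microsecond = 1e6 ps

print(f"time per step: {tau_H:.1f} ps")
print(f"{n_steps} steps: {total_us:.1f} microseconds")
```

With these inputs each step corresponds to about 47 ps of real time.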

Dynamics of Rupture of Contacts: The time evolution of the rupture process is
monitored by using the residue- or nucleotide-dependent number of native contacts, Q_i(t), that
remain at time t. We define Q_i(t) = Σ_{j (|j-i|>2)} Θ(R_C - r_ij(t)) Δ_ij, where Θ is the Heaviside step function, R_C is the cut-off distance
for native contacts, r_ij(t) is the distance between the i-th and the j-th bead, and Δ_ij = 1 for
a native contact (otherwise Δ_ij = 0). If a certain subdomain of the molecule is disrupted and
loses its contacts, then the extension of the molecule suddenly increases and the mechanical
force exerted on the end of the molecule drops instantly. These molecular events are reflected
as rips in the FEC. By comparing the time dependence of the force, f(t), or the end-to-end
distance, R(t), with Q_i(t) by using t as a progressive variable to describe unfolding (see Figures
1C and 1D), we can unambiguously identify the structures involved in the dynamics of contact
rupture.
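
A minimal sketch of how Q_i could be evaluated from bead coordinates is given below. The function names and the four-bead toy structures are hypothetical and serve only to illustrate the definition: the native contact map Δ_ij is built once from the native structure, and Q_i then counts, for each bead i, the native partners j that are still within R_C in the current snapshot.

```python
import math

def native_contact_map(native, r_cut):
    """Ordered pairs (i, j) with |i - j| > 2 that lie within r_cut in the native structure."""
    n = len(native)
    return {(i, j) for i in range(n) for j in range(n)
            if abs(i - j) > 2 and math.dist(native[i], native[j]) < r_cut}

def contacts_per_bead(coords, contact_map, r_cut):
    """Q_i: number of native contacts of bead i that remain within r_cut in `coords`."""
    q = [0] * len(coords)
    for i, j in contact_map:
        if math.dist(coords[i], coords[j]) < r_cut:
            q[i] += 1
    return q

# Hypothetical toy structures: a compact "native" square and the same chain stretched.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
stretched = [(3.8 * i, 0.0, 0.0) for i in range(4)]

cmap = native_contact_map(native, r_cut=8.0)
print(contacts_per_bead(native, cmap, r_cut=8.0))
print(contacts_per_bead(stretched, cmap, r_cut=8.0))
```

Evaluating `contacts_per_bead` for every saved frame of a trajectory would give a bead-by-time array of the kind displayed in the lower panels of Figures 1C and 1D.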
TABLE I: Parameters for topology model of RNA and proteins.





Parameter    RNA                    Protein
R_0          2 Å                    2 Å
k            20 kcal/(mol·Å^2)      20 kcal/(mol·Å^2)
R_C          14 Å                   8 Å
ε_h          0.7 kcal/mol           (1-2) kcal/mol
ε_l          1 kcal/mol             1 kcal/mol
σ            7 Å                    3.8 Å
ζ            100 τ_L^-1             50 τ_L^-1
τ_L          4 ps                   3 ps
VI. FIGURE CAPTIONS



Figure [1] : Forced-Unfolding Dynamics of the Tetrahymena thermophila Ribozyme (A)
Secondary structure of the Tetrahymena thermophila ribozyme. Each subdomain is specified
from P1 to P9. The four major tertiary interactions identified by the orange line are: (a) P2-P5c,
(b) P2.1-P9.1a, (c) P5b-P6a, and (d) P5-P9. (B) Superposition of 51 FECs for the T.
thermophila ribozyme at r_f = 1.88 × 10^4 pN/s, where r_f = kv (v = 5.4 μm/s, k = 3.5 pN/nm).
The value of r_f is about 3780 times greater than in the work by Onoa et al. (2003). The arrows
indicate the rips. (C and D) The number of rips varies between six and eight. The sharp peak
preceding the first arrow (also in [C] and [D]) corresponds to unbinding of the extended 3′ strand
from the rest of the P4 subdomain. This peak is absent in the LOT experiments because the
3′ end is shorter in the L-21 construct. (C) Superposition of FEC for 23 trajectories in which
rupture begins with unraveling of the P9 helix. The arrows identify the position of the rips.
The structures that unravel are explicitly indicated. The dynamics of disruption of individual
contacts (Q_i(t)) (lower panel) for 1 of the 23 trajectories in which unfolding begins with P9
opening. The scale on the right in differing shades gives the number of contacts that survive at
t. The structures in circles (a-d) are shown in (A), and the squares indicate interactions that
stabilize the P3 pseudoknot. (D) Same as (C), except for this class of 28 trajectories, unfolding
occurs by an alternate pathway in which the initial event is the opening of P2. The panel below
gives Q_i(t) for one of the trajectories.

Figure [2] : Snapshots of the Structures in the Unfolding of the T. thermophila Ribozyme (A and B) The structures at
various times along the pathways on the top and the bottom are obtained by the disappearance
of the contacts Q_i(t). In (A), the intermediates in the class of molecules in which P9 opens
first are given; in (B), the structures from the alternate pathway in which P2 unravels initially
are displayed. Along the pathway on the bottom, the P2.1-P9.1a tertiary contact is clearly
observed in the structure at t = 13.3 ms. The structures in (A) and (B) are for the molecules
whose contact rupture history is given in the two panels below Figures 1C and 1D, respectively.

Figure [3] : Loading Rate-Dependent Tension Propagation and Mechanical Unfolding of the
Azoarcus Ribozyme (A) Secondary structure of the Azoarcus ribozyme. (B) Force-extension
curves of the Azoarcus ribozyme at three r_f values (v = 43 μm/s, k_s = 28 pN/nm in red; v = 12.9 μm/s,
k_s = 28 pN/nm in green; and v = 5.4 μm/s, k_s = 3.5 pN/nm in blue). (C) Contact rupture
dynamics at three loading rates. The rips, resolved at the nucleotide level, are explicitly labeled.
(D) Topology of the Azoarcus ribozyme in the SOP representation. The first and the last
alignment angles between the bond vectors and the force direction are specified. (E) Time
evolutions of cos θ_i (i = 1, 2, ..., N-1) at three loading rates are shown. The values of cos θ_i are
color-coded as indicated on the scale shown on the right of the bottom panel. (F) Comparisons
of the time evolution of cos θ_1 (blue) and cos θ_{N-1} (red) at three loading rates show that the
differences in the f_c values at the opposite ends of the ribozyme are greater as r_f increases.

Figure [4] : Force-Quench Refolding of the P4-P6 Domain (A) Refolding dynamics of the
P4-P6 domain of the T. thermophila ribozyme upon force quench starting from a fully stretched
state monitored by Q(t). The structure of the intermediate is explicitly shown. (B) Dynamics of
the end-to-end distance, R(t) (black); the radius of gyration, R_G(t)/R_G (green) (R_G = 2.98 nm
for the native structure); and the root mean square deviation, Δ(t) (blue), with respect to
the native state. In this trajectory, the helices form in ≈1.5 ms, and the transition from a
metastable to a native structure occurs in ≈4 ms. The minimum Δ at long times is 6.4 Å. (C)
Dynamics of contact formation for the trajectory in (A). The early stage of folding is related
to the hairpin loop formation (enclosed in the red box) that is separately shown on the right.
Zipping of the secondary structure propagates from around P5b, P5c, and P6b. Formation of
the tertiary contacts (≈6 ms) in P5a/P5c-P4 and in P5b-P6a is shown in circles.

Figure [5] : Dominant Force-Induced Unfolding Pathway of GFP (A) Native structure of
GFP (chain A from PDB code: 1gfl; 230 residues). (B) Two-dimensional connectivity map of
GFP (top view of (A)). (C) Force-extension curve (FEC) at the pulling speed of v = 2.5 μm/s
and k_s = 35 pN/nm. The secondary structural elements that unravel at the rips are explicitly
indicated. The purple arrow indicates that strands β1-β3 transiently form before further
unfolding. (D) Dynamics of rupture of contacts formed by each residue, Q_i(t), corresponding
to the FEC in (C). The scale on the right in different shades gives the number of contacts that
remain at time t. (E) The structures involved in the unfolding pathway labeled (i)-(iv) are
shown by using the color code: β2, purple; β3, purple; β4, green; β5, green; β6, green; β7,
red; β8, red; β9, red; β10, cyan; and β11, yellow. The N-terminal helix, H1, is in pink. The
secondary structure elements that unfold together upon application of force at the C-terminal
end are shown in the same color.

Figure [6] : Force-Induced Unfolding of GFP: Minor Pathway (A) FEC for GFP unfolding by 
an alternate pathway. The secondary structural elements that are disrupted in the early stages 
are explicitly indicated. (B) Structures that are populated in the transition to the stretched 
states for the trajectory in (A). 

Figure [7] : Refolding of GFP upon Force Quench (A) Dynamics of approach to the native
state upon force quench is monitored by Q(t). The inset shows the time dependence of R(t),
R_G(t), and Δ(t). (B) Time-dependent formation of the native contacts at the residue level
during refolding from stretched GFP.